library

llama3.1

Llama 3.1 is a new state-of-the-art model from Meta available in 8B, 70B and 405B parameter sizes.

117.7M Pulls 93 Tags Updated 1 year ago

deepseek-r1

DeepSeek-R1 is a family of open reasoning models with performance approaching that of leading models, such as O3 and Gemini 2.5 Pro.

tools thinking 1.5b 7b 8b 14b 32b 70b 671b

90.5M Pulls 35 Tags Updated 1 year ago

nomic-embed-text

A high-performing open embedding model with a large token context window.

embedding

79.8M Pulls 3 Tags Updated 2 years ago

llama3.2

Meta's Llama 3.2 goes small with 1B and 3B models.

tools 1b 3b

77.9M Pulls 63 Tags Updated 1 year ago

gemma3

The current, most capable model that runs on a single GPU.

vision 270m 1b 4b 12b 27b

38.9M Pulls 26 Tags Updated 11 months ago

qwen2.5

Qwen2.5 models are pretrained on Alibaba's latest large-scale dataset, encompassing up to 18 trillion tokens. The model supports up to 128K tokens and has multilingual support.

tools 0.5b 1.5b 3b 7b 14b 32b 72b

35.4M Pulls 133 Tags Updated 1 year ago

qwen3

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models.

tools thinking 0.6b 1.7b 4b 8b 14b 30b 32b 235b

32.8M Pulls 58 Tags Updated 9 months ago

mistral

The 7B model released by Mistral AI, updated to version 0.3.

tools 7b

31.4M Pulls 84 Tags Updated 1 year ago

gemma2

Google Gemma 2 is a high-performing and efficient model available in three sizes: 2B, 9B, and 27B.

2b 9b 27b

28.6M Pulls 94 Tags Updated 1 year ago

llama3

Meta Llama 3: The most capable openly available LLM to date

8b 70b

24.8M Pulls 68 Tags Updated 2 years ago

gemma4

Gemma 4 models are designed to deliver frontier-level performance at each size. They are well-suited for reasoning, agentic workflows, coding, and multimodal understanding.

vision tools thinking audio cloud e2b e4b 12b 26b 31b

19.6M Pulls 49 Tags Updated 3 weeks ago

qwen2.5-coder

The latest series of Code-Specific Qwen models, with significant improvements in code generation, code reasoning, and code fixing.

tools 0.5b 1.5b 3b 7b 14b 32b

18.9M Pulls 199 Tags Updated 1 year ago

phi3

Phi-3 is a family of lightweight 3B (Mini) and 14B (Medium) state-of-the-art open models by Microsoft.

3.8b 14b

17.9M Pulls 72 Tags Updated 1 year ago

qwen3.5

Qwen 3.5 is a family of open-source multimodal models that delivers exceptional utility and performance.

vision tools thinking cloud 0.8b 2b 4b 9b 27b 35b 122b

16.3M Pulls 64 Tags Updated 2 months ago

llava

🌋 LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. Updated to version 1.6.

vision 7b 13b 34b

14.4M Pulls 98 Tags Updated 2 years ago

mxbai-embed-large

State-of-the-art large embedding model from mixedbread.ai

embedding 335m

12.9M Pulls 4 Tags Updated 2 years ago

gpt-oss

OpenAI’s open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases.

tools thinking cloud 20b 120b

11.2M Pulls 5 Tags Updated 9 months ago

phi4

Phi-4 is a 14B parameter, state-of-the-art open model from Microsoft.

14b

7.6M Pulls 5 Tags Updated 1 year ago

qwen3-coder

Alibaba's performant long context models for agentic and coding tasks.

tools 30b 480b

7.6M Pulls 9 Tags Updated 10 months ago

qwen

Qwen 1.5 is a series of large language models by Alibaba Cloud spanning from 0.5B to 110B parameters

0.5b 1.8b 4b 7b 14b 32b 72b 110b

7.4M Pulls 379 Tags Updated 2 years ago

gemma

Gemma is a family of lightweight, state-of-the-art open models built by Google DeepMind. Updated to version 1.1

2b 7b

7.3M Pulls 102 Tags Updated 2 years ago

llama2

Llama 2 is a collection of foundation language models ranging from 7B to 70B parameters.

7b 13b 70b

7.3M Pulls 102 Tags Updated 2 years ago

glm-ocr

GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture.

vision tools

6.2M Pulls 3 Tags Updated 5 months ago

qwen2

Qwen2 is a new series of large language models from Alibaba Group

tools 0.5b 1.5b 7b 72b

6.1M Pulls 97 Tags Updated 1 year ago

codellama

A large language model that can use text prompts to generate and discuss code.

7b 13b 34b 70b

5.8M Pulls 199 Tags Updated 2 years ago

mistral-nemo

A state-of-the-art 12B model with 128k context length, built by Mistral AI in collaboration with NVIDIA.

tools 12b

5.5M Pulls 17 Tags Updated 1 year ago

bge-m3

BGE-M3 is a new model from BAAI distinguished for its versatility in Multi-Functionality, Multi-Linguality, and Multi-Granularity.

embedding 567m

5.4M Pulls 3 Tags Updated 1 year ago

minicpm-v

A series of multimodal LLMs (MLLMs) designed for vision-language understanding.

vision 8b

5.3M Pulls 17 Tags Updated 1 year ago

tinyllama

The TinyLlama project is an open endeavor to train a compact 1.1B Llama model on 3 trillion tokens.

1.1b

5.3M Pulls 36 Tags Updated 2 years ago

llama3.2-vision

Llama 3.2 Vision is a collection of instruction-tuned image reasoning generative models in 11B and 90B sizes.

vision 11b 90b

4.9M Pulls 9 Tags Updated 1 year ago

qwen3-vl

The most powerful vision-language model in the Qwen model family to date.

vision tools thinking 2b 4b 8b 30b 32b 235b

4.7M Pulls 57 Tags Updated 8 months ago

qwen3.6

Qwen3.6 delivers substantial upgrades in agentic coding and thinking preservation over previous Qwen models.

vision tools thinking 27b 35b

4.6M Pulls 30 Tags Updated 1 month ago

deepseek-coder

DeepSeek Coder is a capable coding model trained on two trillion code and natural language tokens.

1.3b 6.7b 33b

4.4M Pulls 102 Tags Updated 2 years ago

llama3.3

New state of the art 70B model. Llama 3.3 70B offers similar performance compared to the Llama 3.1 405B model.

tools 70b

4.1M Pulls 14 Tags Updated 1 year ago

dolphin3

Dolphin 3.0 Llama 3.1 8B 🐬 is the next generation of the Dolphin series of instruct-tuned models designed to be the ultimate general purpose local model, enabling coding, math, agentic, function calling, and general use cases.

8b

3.9M Pulls 5 Tags Updated 1 year ago

smollm2

SmolLM2 is a family of compact language models available in three sizes: 135M, 360M, and 1.7B parameters.

tools 135m 360m 1.7b

3.9M Pulls 49 Tags Updated 1 year ago

deepseek-v3

A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.

671b

3.8M Pulls 5 Tags Updated 1 year ago

olmo2

OLMo 2 is a new family of 7B and 13B models trained on up to 5T tokens. These models are on par with or better than equivalently sized fully open models, and competitive with open-weight models such as Llama 3.1 on English academic benchmarks.

7b 13b

3.7M Pulls 9 Tags Updated 1 year ago

qwen2.5vl

Flagship vision-language model of Qwen and also a significant leap from the previous Qwen2-VL.

vision 3b 7b 32b 72b

3.5M Pulls 17 Tags Updated 1 year ago

all-minilm

Embedding models trained on very large sentence-level datasets.

embedding 22m 33m

3.3M Pulls 10 Tags Updated 2 years ago

snowflake-arctic-embed

A suite of text embedding models by Snowflake, optimized for performance.

embedding 22m 33m 110m 137m 335m

3.1M Pulls 16 Tags Updated 2 years ago

codegemma

CodeGemma is a collection of powerful, lightweight models that can perform a variety of coding tasks like fill-in-the-middle code completion, code generation, natural language understanding, mathematical reasoning, and instruction following.

2b 7b

3.1M Pulls 85 Tags Updated 2 years ago

mistral-small

Mistral Small 3 sets a new benchmark in the “small” Large Language Models category below 70B.

tools 22b 24b

3.1M Pulls 21 Tags Updated 1 year ago

granite3.1-moe

The IBM Granite 1B and 3B models are long-context mixture of experts (MoE) Granite models from IBM designed for low latency usage.

tools 1b 3b

3M Pulls 33 Tags Updated 1 year ago

orca-mini

A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.

3b 7b 13b 70b

3M Pulls 119 Tags Updated 2 years ago

starcoder2

StarCoder2 is the next generation of transparently trained open code LLMs that come in three sizes: 3B, 7B and 15B parameters.

3b 7b 15b

2.9M Pulls 67 Tags Updated 1 year ago

deepseek-coder-v2

An open-source Mixture-of-Experts code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks.

16b 236b

2.9M Pulls 64 Tags Updated 1 year ago

qwen3-embedding

Building upon the foundational models of the Qwen3 series, Qwen3 Embedding provides a comprehensive range of text embedding models in various sizes.

embedding 0.6b 4b 8b

2.8M Pulls 12 Tags Updated 10 months ago

nemotron-3-super

NVIDIA Nemotron 3 Super is a 120B open MoE model activating just 12B parameters to deliver maximum compute efficiency and accuracy for complex multi-agent applications.

tools thinking cloud 120b

2.8M Pulls 7 Tags Updated 4 months ago

mixtral

A set of Mixture of Experts (MoE) models with open weights by Mistral AI in 8x7b and 8x22b parameter sizes.

tools 8x7b 8x22b

2.8M Pulls 70 Tags Updated 1 year ago

llama2-uncensored

Uncensored Llama 2 model by George Sung and Jarrad Hope.

7b 70b

2.6M Pulls 34 Tags Updated 2 years ago

falcon3

A family of efficient AI models under 10B parameters performant in science, math, and coding through innovative training techniques.

1b 3b 7b 10b

2.6M Pulls 17 Tags Updated 1 year ago

minimax-m2.7

MiniMax's M2-series model for coding, agentic workflows, and professional productivity.

tools thinking cloud

2.3M Pulls 1 Tag Updated 4 months ago

mistral-small3.2

An update to Mistral Small that improves function calling and instruction following, and reduces repetition errors.

vision tools 24b

2.3M Pulls 5 Tags Updated 1 year ago

minimax-m2.5

MiniMax-M2.5 is a state-of-the-art large language model designed for real-world productivity and coding tasks.

tools thinking cloud

2.3M Pulls 1 Tag Updated 5 months ago

llava-llama3

A LLaVA model fine-tuned from Llama 3 Instruct with better scores in several benchmarks.

vision 8b

2.3M Pulls 4 Tags Updated 2 years ago

glm-5.1

GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin.

tools thinking cloud

2.3M Pulls 1 Tag Updated 3 months ago

qwq

QwQ is the reasoning model of the Qwen series.

tools 32b

2.3M Pulls 8 Tags Updated 1 year ago

gemini-3-flash-preview

Gemini 3 Flash offers frontier intelligence built for speed at a fraction of the cost.

vision tools thinking cloud

2.3M Pulls 1 Tag Updated 7 months ago

cogito

Cogito v1 Preview is a family of hybrid reasoning models by Deep Cogito that outperform the best available open models of the same size, including counterparts from LLaMA, DeepSeek, and Qwen across most standard benchmarks.

tools 3b 8b 14b 32b 70b

2.1M Pulls 20 Tags Updated 1 year ago

dolphin-llama3

Dolphin 2.9 is a new model with 8B and 70B sizes by Eric Hartford based on Llama 3 that has a variety of instruction, conversational, and coding skills.

8b 70b

2M Pulls 53 Tags Updated 2 years ago

smollm

🪐 A family of small models with 135M, 360M, and 1.7B parameters, trained on a new high-quality dataset.

135m 360m 1.7b

2M Pulls 94 Tags Updated 1 year ago

gemma3n

Gemma 3n models are designed for efficient execution on everyday devices such as laptops, tablets or phones.

e2b e4b

1.9M Pulls 9 Tags Updated 1 year ago

qwen3-coder-next

Qwen3-Coder-Next is a coding-focused language model from Alibaba's Qwen team, optimized for agentic coding workflows and local development.

tools

1.9M Pulls 3 Tags Updated 5 months ago

translategemma

A new collection of open translation models built on Gemma 3, helping people communicate across 55 languages.

vision 4b 12b 27b

1.9M Pulls 13 Tags Updated 6 months ago

dolphin-mixtral

Uncensored 8x7b and 8x22b fine-tuned models, based on the Mixtral mixture of experts models, that excel at coding tasks. Created by Eric Hartford.

8x7b 8x22b

1.8M Pulls 70 Tags Updated 1 year ago

llama4

Meta's latest collection of multimodal models.

vision tools 16x17b 128x17b

1.8M Pulls 11 Tags Updated 1 year ago

phi4-reasoning

Phi 4 reasoning and reasoning plus are 14-billion parameter open-weight reasoning models that rival much larger models on complex reasoning tasks.

14b

1.7M Pulls 9 Tags Updated 1 year ago

dolphin-mistral

The uncensored Dolphin model based on Mistral that excels at coding tasks. Updated to version 2.8.

7b

1.6M Pulls 120 Tags Updated 2 years ago

dolphin-phi

2.7B uncensored Dolphin model by Eric Hartford, based on the Phi language model by Microsoft Research.

2.7b

1.6M Pulls 15 Tags Updated 2 years ago

hermes3

Hermes 3 is the latest version of the flagship Hermes series of LLMs by Nous Research

tools 3b 8b 70b 405b

1.5M Pulls 65 Tags Updated 1 year ago

phi

Phi-2: a 2.7B language model by Microsoft Research that demonstrates outstanding reasoning and language understanding capabilities.

2.7b

1.5M Pulls 18 Tags Updated 2 years ago

command-r

Command R is a Large Language Model optimized for conversational interaction and long context tasks.

tools 35b

1.5M Pulls 32 Tags Updated 1 year ago

embeddinggemma

EmbeddingGemma is a 300M parameter embedding model from Google.

embedding 300m

1.5M Pulls 5 Tags Updated 10 months ago

moondream

moondream2 is a small vision language model designed to run efficiently on edge devices.

vision 1.8b

1.5M Pulls 18 Tags Updated 2 years ago

granite-code

A family of open foundation models by IBM for Code Intelligence

3b 8b 20b 34b

1.4M Pulls 162 Tags Updated 1 year ago

magistral

Magistral is a small, efficient reasoning model with 24B parameters.

tools thinking 24b

1.4M Pulls 5 Tags Updated 1 year ago

glm-4.7-flash

As the strongest model in the 30B class, GLM-4.7-Flash offers a new option for lightweight deployment that balances performance and efficiency.

tools thinking

1.4M Pulls 4 Tags Updated 1 month ago

granite4

Granite 4 models feature improved instruction following (IF) and tool-calling capabilities, making them more effective in enterprise applications.

tools 350m 1b 3b

1.4M Pulls 17 Tags Updated 9 months ago

ministral-3

The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware.

vision tools 3b 8b 14b

1.3M Pulls 13 Tags Updated 7 months ago

sqlcoder

SQLCoder is a code completion model fine-tuned on StarCoder for SQL generation tasks

7b 15b

1.3M Pulls 48 Tags Updated 2 years ago

yi

Yi 1.5 is a high-performing, bilingual language model.

6b 9b 34b

1.3M Pulls 174 Tags Updated 2 years ago

phi4-mini

Phi-4-mini brings significant enhancements in multilingual support, reasoning, and mathematics, and now, the long-awaited function calling feature is finally supported.

tools 3.8b

1.3M Pulls 5 Tags Updated 1 year ago

codestral

Codestral is Mistral AI’s first-ever code model designed for code generation tasks.

22b

1.3M Pulls 17 Tags Updated 1 year ago

deepscaler

A fine-tuned version of Deepseek-R1-Distilled-Qwen-1.5B that surpasses the performance of OpenAI’s o1-preview with just 1.5B parameters on popular math evaluations.

1.5b

1.3M Pulls 5 Tags Updated 1 year ago

mistral-large

Mistral Large 2 is Mistral's new flagship model that is significantly more capable in code generation, mathematics, and reasoning, with a 128k context window and support for dozens of languages.

tools 123b

1.3M Pulls 32 Tags Updated 1 year ago

lfm2.5-thinking

LFM2.5 is a new family of hybrid models designed for on-device deployment.

tools thinking 1.2b

1.2M Pulls 5 Tags Updated 6 months ago

wizard-vicuna-uncensored

Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.

7b 13b 30b

1.2M Pulls 49 Tags Updated 2 years ago

zephyr

Zephyr is a series of fine-tuned versions of the Mistral and Mixtral models that are trained to act as helpful assistants.

7b 141b

1.2M Pulls 40 Tags Updated 2 years ago

openchat

A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks. Updated to version 3.5-0106.

7b

1.2M Pulls 50 Tags Updated 2 years ago

starcoder

StarCoder is a code generation model trained on 80+ programming languages.

1b 3b 7b 15b

1.2M Pulls 100 Tags Updated 2 years ago

glm4

A strong multilingual general language model with performance competitive with Llama 3.

9b

1.2M Pulls 32 Tags Updated 2 years ago

wizardlm2

State of the art large language model from Microsoft AI with improved performance on complex chat, multilingual, reasoning and agent use cases.

7b 8x22b

1.2M Pulls 22 Tags Updated 2 years ago

deepseek-v2

A strong, economical, and efficient Mixture-of-Experts language model.

16b 236b

1.2M Pulls 34 Tags Updated 2 years ago

nous-hermes

General use models based on Llama and Llama 2 from Nous Research.

7b 13b

1.2M Pulls 63 Tags Updated 2 years ago

deepseek-llm

An advanced language model crafted with 2 trillion bilingual tokens.

7b 67b

1.1M Pulls 64 Tags Updated 2 years ago

openthinker

A fully open-source family of reasoning models built using a dataset derived by distilling DeepSeek-R1.

7b 32b

1.1M Pulls 15 Tags Updated 1 year ago

vicuna

General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.

7b 13b 33b

1.1M Pulls 111 Tags Updated 2 years ago

falcon

A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chat bots.

7b 40b 180b

1.1M Pulls 38 Tags Updated 2 years ago

lfm2

LFM2 is a family of hybrid models designed for on-device deployment. LFM2-24B-A2B is the largest model in the family, scaling the architecture to 24 billion parameters while keeping inference efficient.

tools 24b

1.1M Pulls 6 Tags Updated 5 months ago

openhermes

OpenHermes 2.5 is a 7B model fine-tuned by Teknium on Mistral with fully open datasets.

1.1M Pulls 35 Tags Updated 2 years ago

codeqwen

CodeQwen1.5 is a large language model pretrained on a large amount of code data.

7b

1.1M Pulls 30 Tags Updated 2 years ago

qwen2-math

Qwen2 Math is a series of specialized math language models built upon the Qwen2 LLMs, which significantly outperforms the mathematical capabilities of open-source models and even closed-source models (e.g., GPT4o).

1.5b 7b 72b

1.1M Pulls 52 Tags Updated 1 year ago

granite3.3

IBM Granite 2B and 8B models are 128K context length language models that have been fine-tuned for improved reasoning and instruction-following capabilities.

tools 2b 8b

1.1M Pulls 3 Tags Updated 1 year ago

aya

Aya 23, released by Cohere, is a new family of state-of-the-art, multilingual models that support 23 languages.

8b 35b

1.1M Pulls 33 Tags Updated 2 years ago

nous-hermes2

The powerful family of models by Nous Research that excels at scientific discussion and coding tasks.

10.7b 34b

1.1M Pulls 33 Tags Updated 2 years ago

neural-chat

A fine-tuned model based on Mistral with good coverage of domain and language.

7b

1.1M Pulls 50 Tags Updated 2 years ago

llama2-chinese

Llama 2-based model fine-tuned to improve Chinese dialogue ability.

7b 13b

1M Pulls 35 Tags Updated 2 years ago

stable-code

Stable Code 3B is a coding model with instruct and code completion variants on par with models such as Code Llama 7B that are 2.5x larger.

3b

1M Pulls 36 Tags Updated 2 years ago

yi-coder

Yi-Coder is a series of open-source code language models that delivers state-of-the-art coding performance with fewer than 10 billion parameters.

1.5b 9b

1M Pulls 67 Tags Updated 1 year ago

wizardcoder

State-of-the-art code generation model

33b

1M Pulls 67 Tags Updated 2 years ago

stablelm2

Stable LM 2 is a state-of-the-art 1.6B and 12B parameter language model trained on multilingual data in English, Spanish, German, Italian, French, Portuguese, and Dutch.

1.6b 12b

1M Pulls 84 Tags Updated 2 years ago

granite3-dense

The IBM Granite 2B and 8B models are designed to support tool-based use cases and retrieval-augmented generation (RAG), streamlining code generation, translation, and bug fixing.

tools 2b 8b

996.8K Pulls 33 Tags Updated 1 year ago

llama3-chatqa

A model from NVIDIA based on Llama 3 that excels at conversational question answering (QA) and retrieval-augmented generation (RAG).

8b 70b

996.7K Pulls 35 Tags Updated 2 years ago

llama-guard3

Llama Guard 3 is a series of models fine-tuned for content safety classification of LLM inputs and responses.

1b 8b

994.9K Pulls 33 Tags Updated 1 year ago

granite3.1-dense

The IBM Granite 2B and 8B models are text-only dense LLMs trained on over 12 trillion tokens of data, demonstrating significant improvements over their predecessors in performance and speed in IBM’s initial testing.

tools 2b 8b

994.8K Pulls 33 Tags Updated 1 year ago

phi3.5

A lightweight AI model with 3.8 billion parameters whose performance overtakes similarly sized and larger models.

3.8b

992.6K Pulls 17 Tags Updated 1 year ago

devstral

Devstral: the best open source model for coding agents

tools 24b

988.5K Pulls 5 Tags Updated 1 year ago

wizard-math

Model focused on math and logic problems

7b 13b 70b

988.1K Pulls 64 Tags Updated 2 years ago

dolphincoder

A 7B and 15B uncensored variant of the Dolphin model family that excels at coding, based on StarCoder2.

7b 15b

980K Pulls 35 Tags Updated 2 years ago

aya-expanse

Cohere For AI's language models trained to perform well across 23 different languages.

tools 8b 32b

976.2K Pulls 33 Tags Updated 1 year ago

internlm2

InternLM2.5 is a 7B parameter model tailored for practical scenarios with outstanding reasoning capability.

1m 1.8b 7b 20b

976.1K Pulls 65 Tags Updated 1 year ago

llama3-gradient

This model extends Llama-3 8B's context length from 8k to over 1M tokens.

8b 70b

973.6K Pulls 35 Tags Updated 2 years ago

samantha-mistral

A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.

7b

972.3K Pulls 49 Tags Updated 2 years ago

llama3-groq-tool-use

A series of models from Groq that represent a significant advancement in open-source AI capabilities for tool use/function calling.

tools 8b 70b

967.4K Pulls 33 Tags Updated 2 years ago

xwinlm

Conversational model based on Llama 2 that performs competitively on various benchmarks.

7b 13b

952.7K Pulls 80 Tags Updated 2 years ago

granite3.2-vision

A compact and efficient vision-language model, specifically designed for visual document understanding, enabling automated content extraction from tables, charts, infographics, plots, diagrams, and more.

vision tools 2b

951.4K Pulls 5 Tags Updated 1 year ago

starling-lm

Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.

7b

946.1K Pulls 36 Tags Updated 2 years ago

phind-codellama

Code generation model based on Code Llama.

34b

943.7K Pulls 49 Tags Updated 2 years ago

solar

A compact, yet powerful 10.7B large language model designed for single-turn conversation.

10.7b

940.9K Pulls 32 Tags Updated 2 years ago

yarn-llama2

An extension of Llama 2 that supports a context of up to 128k tokens.

7b 13b

937.8K Pulls 67 Tags Updated 2 years ago

paraphrase-multilingual

Sentence-transformers model that can be used for tasks like clustering or semantic search.

embedding 278m

930.3K Pulls 3 Tags Updated 1 year ago

granite3-moe

The IBM Granite 1B and 3B models are the first mixture of experts (MoE) Granite models from IBM designed for low latency usage.

tools 1b 3b

927.7K Pulls 33 Tags Updated 1 year ago

devstral-small-2

24B model that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents.

vision tools 24b

916.5K Pulls 6 Tags Updated 7 months ago

orca2

Orca 2 is built by Microsoft Research and is a fine-tuned version of Meta's Llama 2 models. The model is designed to excel particularly in reasoning.

7b 13b

910.7K Pulls 33 Tags Updated 2 years ago

stable-beluga

Llama 2-based model fine-tuned on an Orca-style dataset. Originally called Free Willy.

7b 13b 70b

909.1K Pulls 49 Tags Updated 2 years ago

reader-lm

A series of models that convert HTML content to Markdown content, which is useful for content conversion tasks.

0.5b 1.5b

901.9K Pulls 33 Tags Updated 1 year ago

deepcoder

DeepCoder is a fully open-source 14B coder model at O3-mini level, with a 1.5B version also available.

1.5b 14b

900.4K Pulls 9 Tags Updated 1 year ago

shieldgemma

ShieldGemma is a set of instruction-tuned models for evaluating the safety of text prompt input and text output responses against a set of defined safety policies.

2b 9b 27b

900.3K Pulls 49 Tags Updated 1 year ago

llama-pro

An expansion of Llama 2 that specializes in integrating both general language understanding and domain-specific knowledge, particularly in programming and mathematics.

884K Pulls 33 Tags Updated 2 years ago

yarn-mistral

An extension of Mistral to support context windows of 64K or 128K.

7b

878K Pulls 33 Tags Updated 2 years ago

nexusraven

Nexus Raven is a 13B instruction tuned model for function calling tasks.

13b

873.4K Pulls 32 Tags Updated 2 years ago

wizardlm

General use model based on Llama 2.

868.6K Pulls 73 Tags Updated 2 years ago

bakllava

BakLLaVA is a multimodal model consisting of the Mistral 7B base model augmented with the LLaVA architecture.

vision 7b

868.5K Pulls 17 Tags Updated 2 years ago

meditron

Open-source medical large language model adapted from Llama 2 to the medical domain.

7b 70b

782.1K Pulls 22 Tags Updated 2 years ago

command-r-plus

Command R+ is a powerful, scalable large language model purpose-built to excel at real-world enterprise use cases.

tools 104b

781.7K Pulls 21 Tags Updated 1 year ago

mistral-small3.1

Building upon Mistral Small 3, Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and enhances long context capabilities up to 128k tokens without compromising text performance.

vision tools 24b

766.6K Pulls 5 Tags Updated 1 year ago

exaone-deep

EXAONE Deep is a family of models ranging from 2.4B to 32B parameters, developed and released by LG AI Research, that exhibits superior capabilities in various reasoning tasks, including math and coding benchmarks.

2.4b 7.8b 32b

753.1K Pulls 13 Tags Updated 1 year ago

deepseek-v3.1

DeepSeek-V3.1-Terminus is a hybrid model that supports both thinking mode and non-thinking mode.

tools thinking 671b

711.7K Pulls 7 Tags Updated 10 months ago

tinydolphin

An experimental 1.1B parameter model trained on the new Dolphin 2.8 dataset by Eric Hartford and based on TinyLlama.

1.1b

708.3K Pulls 18 Tags Updated 2 years ago

nemotron-mini

A commercial-friendly small language model by NVIDIA optimized for roleplay, RAG QA, and function calling.

tools 4b

690.9K Pulls 17 Tags Updated 1 year ago

codegeex4

A versatile model for AI software development scenarios, including code completion.

9b

671.9K Pulls 17 Tags Updated 2 years ago

mistral-openorca

Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.

7b

666K Pulls 17 Tags Updated 2 years ago

nemotron-3-nano

Nemotron-3-Nano is a new standard for efficient, open, and intelligent agentic models, now updated with a 4B-parameter model.

tools thinking cloud 4b 30b

632.1K Pulls 9 Tags Updated 4 months ago

wizardlm-uncensored

Uncensored version of Wizard LM model

13b

628.3K Pulls 18 Tags Updated 2 years ago

nemotron3

NVIDIA Nemotron 3 Nano Omni is a multimodal large language model that unifies video, audio, image, and text understanding to support enterprise-grade Q&A, summarization, transcription, and document intelligence workflows.

vision tools thinking audio 33b

623.7K Pulls 4 Tags Updated 2 months ago

medllama2

Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.

7b

621.2K Pulls 17 Tags Updated 2 years ago

opencoder

OpenCoder is an open and reproducible code LLM family which includes 1.5B and 8B models, supporting chat in English and Chinese languages.

1.5b 8b

610.3K Pulls 9 Tags Updated 1 year ago

reflection

A high-performing model trained with a new technique called Reflection-tuning that teaches an LLM to detect mistakes in its reasoning and correct course.

70b

603.1K Pulls 17 Tags Updated 1 year ago

nemotron

Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA to improve the helpfulness of LLM generated responses to user queries.

tools 70b

599K Pulls 17 Tags Updated 1 year ago

nous-hermes2-mixtral

The Nous Hermes 2 model from Nous Research, now trained over Mixtral.

8x7b

576.7K Pulls 18 Tags Updated 1 year ago

codeup

Great code generation model based on Llama2.

13b

576.4K Pulls 19 Tags Updated 2 years ago

athene-v2

Athene-V2 is a 72B parameter model which excels at code completion, mathematics, and log extraction tasks.

tools 72b

575.6K Pulls 17 Tags Updated 1 year ago

qwen3-next

The first installment in the Qwen3-Next series with strong performance in terms of both parameter efficiency and inference speed.

tools thinking 80b

575.3K Pulls 9 Tags Updated 7 months ago

nomic-embed-text-v2-moe

nomic-embed-text-v2-moe is a multilingual MoE text embedding model that excels at multilingual retrieval.

embedding

565.3K Pulls 1 Tag Updated 7 months ago

megadolphin

MegaDolphin-2.2-120b is a transformation of Dolphin-2.2-70b created by interleaving the model with itself.

120b

554.8K Pulls 19 Tags Updated 2 years ago

everythinglm

Uncensored Llama2 based model with support for a 16K context window.

13b

551.5K Pulls 18 Tags Updated 2 years ago

solar-pro

Solar Pro Preview: an advanced large language model (LLM) with 22 billion parameters designed to fit into a single GPU

22b

544.2K Pulls 18 Tags Updated 1 year ago

magicoder

🎩 Magicoder is a family of 7B parameter models trained on 75K synthetic instruction data using OSS-Instruct, a novel approach to enlightening LLMs with open-source code snippets.

7b

541.1K Pulls 18 Tags Updated 2 years ago

mathstral

MathΣtral: a 7B model designed for math reasoning and scientific discovery by Mistral AI.

7b

534.7K Pulls 17 Tags Updated 2 years ago

exaone3.5

EXAONE 3.5 is a collection of instruction-tuned bilingual (English and Korean) generative models ranging from 2.4B to 32B parameters, developed and released by LG AI Research.

2.4b 7.8b 32b

532.9K Pulls 13 Tags Updated 1 year ago

falcon2

Falcon2 is an 11B-parameter causal decoder-only model built by TII and trained over 5T tokens.

11b

526.5K Pulls 17 Tags Updated 2 years ago

notus

A 7B chat model fine-tuned with high-quality data and based on Zephyr.

7b

525.6K Pulls 18 Tags Updated 2 years ago

notux

A top-performing mixture of experts model, fine-tuned with high-quality data.

8x7b

524.7K Pulls 18 Tags Updated 2 years ago

nuextract

A 3.8B model fine-tuned on a private high-quality synthetic dataset for information extraction, based on Phi-3.

3.8b

521.4K Pulls 17 Tags Updated 2 years ago

stablelm-zephyr

A lightweight chat model allowing accurate and responsive output without requiring high-end hardware.

3b

519.5K Pulls 17 Tags Updated 2 years ago

bespoke-minicheck

A state-of-the-art fact-checking model developed by Bespoke Labs.

7b

514.7K Pulls 17 Tags Updated 1 year ago

duckdb-nsql

7B parameter text-to-SQL model made by MotherDuck and Numbers Station.

7b

514K Pulls 17 Tags Updated 2 years ago

mistrallite

MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.

7b

510.7K Pulls 17 Tags Updated 2 years ago

firefunction-v2

An open weights function calling model based on Llama 3, competitive with GPT-4o function calling capabilities.

tools 70b

509.6K Pulls 17 Tags Updated 2 years ago

wizard-vicuna

Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.

13b

504.8K Pulls 17 Tags Updated 2 years ago

deepseek-ocr

DeepSeek-OCR is a vision-language model that can perform token-efficient OCR.

vision 3b

497.8K Pulls 3 Tags Updated 8 months ago

open-orca-platypus2

Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.

13b

497.1K Pulls 17 Tags Updated 2 years ago

rnj-1

Rnj-1 is a family of 8B parameter open-weight, dense models trained from scratch by Essential AI, optimized for code and STEM with capabilities on par with SOTA open-weight models.

tools 8b

488.8K Pulls 5 Tags Updated 7 months ago

codebooga

A high-performing code instruct model created by merging two existing code models.

34b

484.4K Pulls 16 Tags Updated 2 years ago

goliath

A language model created by combining two fine-tuned Llama 2 70B models into one.

467.1K Pulls 16 Tags Updated 2 years ago

granite3.2

Granite-3.2 is a family of long-context AI models from IBM Granite fine-tuned for thinking capabilities.

tools 2b 8b

446.1K Pulls 9 Tags Updated 1 year ago

olmo-3

Olmo is a series of Open language models designed to enable the science of language models. These models are pre-trained on the Dolma 3 dataset and post-trained on the Dolci datasets.

tools thinking 7b 32b

444K Pulls 15 Tags Updated 7 months ago

snowflake-arctic-embed2

Snowflake's frontier embedding model. Arctic Embed 2.0 adds multilingual support without sacrificing English performance or scalability.

embedding 568m

427.9K Pulls 3 Tags Updated 1 year ago

r1-1776

A version of the DeepSeek-R1 model that has been post-trained by Perplexity to provide unbiased, accurate, and factual information.

70b 671b

412.6K Pulls 9 Tags Updated 1 year ago

sailor2

Sailor2 are multilingual language models made for South-East Asia. Available in 1B, 8B, and 20B parameter sizes.

1b 8b 20b

401.9K Pulls 13 Tags Updated 1 year ago

kimi-k2.6

Kimi K2.6 is an open-source, native multimodal agentic model that advances practical capabilities in long-horizon coding, coding-driven design, proactive autonomous execution, and swarm-based task orchestration.

vision tools thinking cloud

400.1K Pulls 1 Tag Updated 3 months ago

tulu3

Tülu 3 is a leading instruction-following model family, offering fully open-source data, code, and recipes by the Allen Institute for AI.

8b 70b

374.9K Pulls 9 Tags Updated 1 year ago

kimi-k2.5

Kimi K2.5 is an open-source, native multimodal agentic model that seamlessly integrates vision and language understanding with advanced agentic capabilities, instant and thinking modes, as well as conversational and agentic paradigms.

vision tools thinking cloud

368.9K Pulls 1 Tag Updated 6 months ago

dbrx

DBRX is an open, general-purpose LLM created by Databricks.

132b

344.1K Pulls 7 Tags Updated 2 years ago

granite-embedding

The IBM Granite Embedding 30M and 278M models are text-only dense biencoder embedding models, with 30M available in English only and 278M serving multilingual use cases.

embedding 30m 278m

342.9K Pulls 6 Tags Updated 1 year ago

devstral-2

123B model that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents.

tools 123b

332K Pulls 5 Tags Updated 7 months ago

granite3-guardian

The IBM Granite Guardian 3.0 2B and 8B models are designed to detect risks in prompts and/or responses.

2b 8b

326.5K Pulls 10 Tags Updated 1 year ago

minimax-m3

MiniMax M3: Coding & Agentic Frontier. 1M context window. Native Multimodality.

vision tools thinking cloud

314.6K Pulls 1 Tag Updated 1 month ago

ornith

A self-improving family of open-source models for agentic coding

9b 35b

298.8K Pulls 9 Tags Updated 4 weeks ago

llava-phi3

A new small LLaVA model fine-tuned from Phi 3 Mini.

vision 3.8b

298.7K Pulls 4 Tags Updated 2 years ago

phi4-mini-reasoning

Phi 4 mini reasoning is a lightweight open model that balances efficiency with advanced reasoning ability.

3.8b

283.6K Pulls 5 Tags Updated 1 year ago

command-r7b

The smallest model in Cohere's R series delivers top-tier speed, efficiency, and quality to build powerful AI applications on commodity GPUs and edge devices.

tools 7b

283.5K Pulls 5 Tags Updated 1 year ago

deepseek-v2.5

An upgraded version of DeepSeek-V2 that integrates the general and coding abilities of both DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.

236b

282.2K Pulls 7 Tags Updated 1 year ago

olmo-3.1

Olmo is a series of Open language models designed to enable the science of language models. These models are pre-trained on the Dolma 3 dataset and post-trained on the Dolci datasets.

tools thinking 32b

282.1K Pulls 10 Tags Updated 7 months ago

deepseek-v4-pro

DeepSeek-V4-Pro is a frontier Mixture-of-Experts model with a large context window and three reasoning modes.

tools thinking cloud

281.2K Pulls 1 Tag Updated 3 months ago

granite4.1

IBM Granite Models are a family of enterprise-ready, open foundation models that support multilingual capabilities, coding, retrieval-augmented generation (RAG), tool use, and structured JSON output. Released under the Apache 2.0 license.

tools 3b 8b 30b

281K Pulls 48 Tags Updated 2 months ago

deepseek-v4-flash

DeepSeek-V4-Flash is a preview of the DeepSeek-V4 series, a Mixture-of-Experts model with 284B total parameters and 13B activated, built for efficient reasoning across a 1M-token context window.

tools thinking cloud

277.4K Pulls 1 Tag Updated 3 months ago

bge-large

Embedding model from BAAI mapping texts to vectors.

embedding 335m

276.5K Pulls 3 Tags Updated 1 year ago

glm-5.2

GLM-5.2 is Z.ai’s flagship model for the era of long-horizon tasks.

tools thinking cloud

267.1K Pulls 1 Tag Updated 1 month ago

smallthinker

A new small reasoning model fine-tuned from the Qwen 2.5 3B Instruct model.

3b

252.9K Pulls 5 Tags Updated 1 year ago

mistral-medium-3.5

Mistral Medium 3.5 is the first flagship model of Mistral AI that merged instruction-following, reasoning, and coding in a single set of 128B weights.

vision tools thinking 128b

240.4K Pulls 5 Tags Updated 2 months ago

alfred

A robust conversational model designed to be used for both chat and instruct use cases.

40b

234.6K Pulls 7 Tags Updated 2 years ago

command-a

111 billion parameter model optimized for demanding enterprises that require fast, secure, and high-quality AI

tools 111b

222K Pulls 5 Tags Updated 1 year ago

marco-o1

An open large reasoning model for real-world solutions by the Alibaba International Digital Commerce Group (AIDC-AI).

7b

208K Pulls 5 Tags Updated 1 year ago

medgemma

MedGemma is a collection of Gemma 3 variants that are trained for performance on medical text and image comprehension.

vision 4b 27b

204.6K Pulls 9 Tags Updated 3 months ago

cogito-2.1

The Cogito v2.1 LLMs are instruction-tuned generative models. All models are released under the MIT license for commercial use.

671b

202.5K Pulls 5 Tags Updated 8 months ago

command-r7b-arabic

A new state-of-the-art version of the lightweight Command R7B model that excels in advanced Arabic language capabilities for enterprises in the Middle East and Northern Africa.

tools 7b

198.5K Pulls 5 Tags Updated 1 year ago

kimi-k2.7-code

Kimi K2.7 Code is Moonshot AI's coding-focused agentic model built upon Kimi K2.6, with substantial improvements on real-world long-horizon coding tasks and roughly 30% lower thinking-token usage.

vision tools thinking cloud

187.4K Pulls 1 Tag Updated 1 month ago

functiongemma

FunctionGemma is a specialized version of Google's Gemma 3 270M model fine-tuned explicitly for function calling.

tools 270m

174.7K Pulls 4 Tags Updated 7 months ago

gpt-oss-safeguard

gpt-oss-safeguard-20b and gpt-oss-safeguard-120b are safety reasoning models built upon gpt-oss.

tools thinking 20b 120b

148.6K Pulls 3 Tags Updated 9 months ago

nemotron-cascade-2

An open 30B MoE model from NVIDIA with 3B activated parameters that delivers strong reasoning and agentic capabilities.

tools thinking 30b

136.3K Pulls 3 Tags Updated 4 months ago

medgemma1.5

MedGemma 1.5 4B is an updated version of the MedGemma 4B model.

vision 4b

96.4K Pulls 5 Tags Updated 3 months ago

lfm2.5

LFM2.5-8B-A1B, an edge model built for fast, reliable tool calling on consumer hardware.

tools thinking 8b

92.8K Pulls 5 Tags Updated 1 month ago

mistral-large-3

A general-purpose multimodal mixture-of-experts model for production-grade tasks and enterprise workloads.

vision tools cloud

78.5K Pulls 1 Tag Updated 7 months ago

laguna-xs-2.1

Laguna XS 2.1 is a 33B total parameter Mixture-of-Experts model with 3B activated parameters per token designed for agentic coding and long-horizon work on a local machine.

tools thinking

71.4K Pulls 7 Tags Updated yesterday

north-mini-code-1.0

North Mini Code is Cohere's first model for developers — a 30B Mixture-of-Experts model with 3B active parameters, built for agentic software engineering.

tools thinking

31.1K Pulls 7 Tags Updated 1 month ago

nemotron-3-ultra

NVIDIA Nemotron 3 Ultra is built for high-throughput reasoning and long-running agent workflows.

tools thinking cloud

30.2K Pulls 1 Tag Updated 1 month ago

laguna-xs.2

Laguna XS.2 is a 33B total parameter Mixture-of-Experts model with 3B activated parameters per token designed for agentic coding and long-horizon work on a local machine.

tools thinking

22.4K Pulls 7 Tags Updated yesterday

minicpm-v4.6

A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone

vision 1b

21.7K Pulls 13 Tags Updated 1 month ago

minicpm-v4.5

A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone

vision 8b

17.4K Pulls 13 Tags Updated 1 month ago

laguna-s-2.1

Our most capable model to date, designed for long-horizon work. 70.2% on Terminal-Bench 2.1 at 118B-A8B.

tools thinking

9,090 Pulls 7 Tags Updated yesterday

granite4.1-guardian

Granite Guardian 4.1 is a specialized safety and judging model from IBM Research that evaluates whether LLM prompts and responses meet specified harm criteria.

tools thinking 8b

5,852 Pulls 16 Tags Updated 1 month ago